Proximal Nodes: a Model to Query Document Databases by Contents and Structure

Authors

GONZALO NAVARRO

Abstract

A model to query document databases by both their content and structure is presented. The goal is to obtain a query language which is expressive in practice while being efficiently implementable, features not present at the same time in previous work. The key ideas of the model are a set-oriented query language based on operations on nearby structure elements of one or more hierarchies, together with content and structural indexing and bottom-up evaluation. The model is evaluated regarding expressiveness and efficiency, showing that it provides a good trade-off between both goals. Finally, it is shown how to include in the model other media different from text.

1. INTRODUCTION

Document databases are deserving more and more attention, due to their multiple applications: digital libraries, office automation, software engineering, automated dictionaries and encyclopedias, etc. [Frakes and Baeza-Yates 1992]. The purpose of a document database is to store documents, structured or not. A document database is composed of two parts: content and (if present) structure. The content is the data itself, while the structure relates different parts of the database by some criterion.

Any information model for a document database should comprise three parts: data, structure, and query language. It must specify how the data is seen (i.e.

Similar resources

Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As we move to a big-data world where unstructured data is growing at high velocity, there is a need for a data store that can hold this large amount of data. Traditionally, relational databases have been used, but they are not suited to handling this volume of data, so it is necessary to move to non-relational data stores. In the current study, we have proposed an extension of the Mongo...

A New Model for Phrase Search Based on Minimum Weighted Displacement

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

Expressive Power of a New Model for Structured Text Databases

This paper studies the expressivity of a new model for structuring and querying textual databases by both the structure and contents of the text. The key idea of the model is a set-oriented query language based on operations on proximal nodes. This model has been shown to be efficiently implementable, and the aim of this paper is to show that it is competitive in expressivity with models whose im...
